Keyword Search Result

[Keyword] supervised learning(66hit)

41-60hit(66hit)

  • An Unsupervised Opinion Mining Approach for Japanese Weblog Reputation Information Using an Improved SO-PMI Algorithm

    Guangwei WANG  Kenji ARAKI  

     
    PAPER-Data Mining

    Vol: E91-D No:4  Page(s): 1032-1041

    In this paper, we propose an improved SO-PMI (Semantic Orientation Using Pointwise Mutual Information) algorithm for Japanese weblog opinion mining. SO-PMI is an unsupervised approach proposed by Turney that has been shown to work well for English. When the algorithm was naively carried over to Japanese, most phrases, whether positive or negative in meaning, received a negative SO. To deal with this skew, we propose three improvements: expanding the reference words to sets of words, introducing a balancing factor, and detecting neutral expressions. In our experiments, the proposed improvements produced a well-balanced result: both positive and negative accuracy exceeded 62% when evaluated on 1,200 opinion sentences sampled from three domains (reviews of electronic products, cars, and travel from Kakaku.com). In a comparative experiment on the same corpus, a supervised approach (SA-Demo) achieved accuracy very similar to that of our method. This shows that the proposed approach effectively adapts SO-PMI to Japanese, and it also shows the generality of SO-PMI.
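
As a rough illustration of the baseline that the abstract above starts from, the following sketch computes a Turney-style semantic orientation from co-occurrence counts. It is not the authors' improved algorithm: the balancing factor and neutral-expression detection are not specified in the abstract and are left out. The function names, the toy counts, and the choice to sum counts over each reference word set are assumptions made for illustration.

```python
import math


def so_pmi(near_pos, near_neg, total_pos, total_neg, smoothing=0.01):
    """Turney-style semantic orientation of a phrase.

    near_pos / near_neg: how often the phrase co-occurs with the positive /
    negative reference words. total_pos / total_neg: how often the reference
    words occur overall. The smoothing constant avoids division by zero.
    """
    numerator = (near_pos + smoothing) * (total_neg + smoothing)
    denominator = (near_neg + smoothing) * (total_pos + smoothing)
    return math.log2(numerator / denominator)


def so_pmi_over_sets(phrase_counts, reference_counts, positive_words, negative_words):
    """Use sets of reference words by summing counts over each set.

    Summation is an assumption; the abstract only says that the reference
    words are expanded to sets of words.
    """
    near_pos = sum(phrase_counts.get(w, 0) for w in positive_words)
    near_neg = sum(phrase_counts.get(w, 0) for w in negative_words)
    total_pos = sum(reference_counts.get(w, 0) for w in positive_words)
    total_neg = sum(reference_counts.get(w, 0) for w in negative_words)
    return so_pmi(near_pos, near_neg, total_pos, total_neg)


# Hypothetical toy counts, not data from the paper.
phrase_counts = {"good": 5, "great": 3, "bad": 2}
reference_counts = {"good": 60, "great": 40, "bad": 70, "awful": 30}
score = so_pmi_over_sets(phrase_counts, reference_counts, ["good", "great"], ["bad", "awful"])
print(score > 0)
```

A positive score suggests a positive orientation and a negative score a negative one. The skew described in the abstract would show up here as scores that are negative for most phrases regardless of meaning.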

  • A New Meta-Criterion for Regularized Subspace Information Criterion

    Yasushi HIDAKA  Masashi SUGIYAMA  

     
    PAPER-Pattern Recognition

    Vol: E90-D No:11  Page(s): 1779-1786

    In order to obtain better generalization performance in supervised learning, model parameters should be determined appropriately, i.e., they should be determined so that the generalization error is minimized. However, since the generalization error is inaccessible in practice, the model parameters are usually determined so that an estimator of the generalization error is minimized. The regularized subspace information criterion (RSIC) is such a generalization error estimator for model selection. RSIC includes an additional regularization parameter and it should be determined appropriately for better model selection. A meta-criterion for determining the regularization parameter has also been proposed and shown to be useful in practice. In this paper, we show that there are several drawbacks in the existing meta-criterion and give an alternative meta-criterion that can solve the problems. Through simulations, we show that the use of the new meta-criterion further improves the model selection performance.

  • Analytic Optimization of Adaptive Ridge Parameters Based on Regularized Subspace Information Criterion

    Shun GOKITA  Masashi SUGIYAMA  Keisuke SAKURAI  

     
    PAPER-Neural Networks and Bioengineering

    Vol: E90-A No:11  Page(s): 2584-2592

    In order to obtain better learning results in supervised learning, it is important to choose model parameters appropriately. Model selection is usually carried out by preparing a finite set of model candidates, estimating a generalization error for each candidate, and choosing the best one from the candidates. If the number of candidates is increased in this procedure, the optimization quality may be improved. However, this in turn increases the computational cost. In this paper, we focus on a generalization error estimator called the regularized subspace information criterion and derive an analytic form of the optimal model parameter over a set of infinitely many model candidates. This allows us to maximize the optimization quality while the computational cost is kept moderate.
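
The finite-candidate procedure described in the abstract above can be sketched as follows. This is only an illustration under simplifying assumptions: it uses a one-dimensional ridge model without an intercept, and it uses hold-out error as a stand-in for the generalization error estimator (it does not implement the regularized subspace information criterion or the analytic solution derived in the paper). All function names are hypothetical.

```python
def fit_ridge_1d(xs, ys, lam):
    """Closed-form ridge solution for the model y ~ w * x (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def mean_squared_error(w, xs, ys):
    """Average squared residual of the model y ~ w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def select_ridge_parameter(train, valid, candidates):
    """Return (best_lambda, estimated_error) over a finite candidate set.

    Hold-out error stands in for a generalization error estimator here.
    """
    train_x, train_y = train
    valid_x, valid_y = valid
    best = None
    for lam in candidates:
        w = fit_ridge_1d(train_x, train_y, lam)
        err = mean_squared_error(w, valid_x, valid_y)
        if best is None or err < best[1]:
            best = (lam, err)
    return best


# Noise-free toy data with true slope 2, so no regularization is best.
best_lambda, best_error = select_ridge_parameter(
    train=([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]),
    valid=([4.0], [8.0]),
    candidates=[0.0, 1.0, 10.0],
)
print(best_lambda, best_error)
```

The loop makes the trade-off in the abstract concrete: a finer candidate grid can only improve the selected value, but each additional candidate costs one more fit and one more error estimate.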

  • Generalization Error Estimation for Non-linear Learning Methods

    Masashi SUGIYAMA  

     
    LETTER-Neural Networks and Bioengineering

    Vol: E90-A No:7  Page(s): 1496-1499

    Estimating the generalization error is one of the key ingredients of supervised learning since a good generalization error estimator can be used for model selection. An unbiased generalization error estimator called the subspace information criterion (SIC) is shown to be useful for model selection, but its range of application is limited to linear learning methods. In this paper, we extend SIC to be applicable to non-linear learning.

  • Analytic Optimization of Shrinkage Parameters Based on Regularized Subspace Information Criterion

    Masashi SUGIYAMA  Keisuke SAKURAI  

     
    PAPER-Neural Networks and Bioengineering

    Vol: E89-A No:8  Page(s): 2216-2225

    For obtaining a higher level of generalization capability in supervised learning, model parameters should be optimized, i.e., they should be determined in such a way that the generalization error is minimized. However, since the generalization error is inaccessible in practice, model parameters are usually determined in such a way that an estimate of the generalization error is minimized. A standard procedure for model parameter optimization is to first prepare a finite set of candidates of model parameter values, estimate the generalization error for each candidate, and then choose the best one from the candidates. If the number of candidates is increased in this procedure, the optimization quality may be improved. However, this in turn increases the computational cost. In this paper, we give methods for analytically finding the optimal model parameter value from a set of infinitely many candidates. This maximally enhances the optimization quality while the computational cost is kept reasonable.

  • Constructing Kernel Functions for Binary Regression

    Masashi SUGIYAMA  Hidemitsu OGAWA  

     
    PAPER-Pattern Recognition

    Vol: E89-D No:7  Page(s): 2243-2249

    Kernel-based learning algorithms have been successfully applied in various problem domains, given appropriate kernel functions. In this paper, we discuss the problem of designing kernel functions for binary regression and show that using a bell-shaped cosine function as a kernel function is optimal in some sense. The rationale of this result is based on the Karhunen-Loeve expansion, i.e., the optimal approximation to a set of functions is given by the principal component of the correlation operator of the functions.

  • Unsupervised Word-Sense Disambiguation Using Bilingual Comparable Corpora

    Hiroyuki KAJI  Yasutsugu MORIMOTO  

     
    PAPER-Natural Language Processing

    Vol: E88-D No:2  Page(s): 289-301

    An unsupervised method for word-sense disambiguation using bilingual comparable corpora was developed. First, it extracts word associations, i.e., statistically significant pairs of associated words, from the corpus of each language. Then, it aligns word associations by consulting a bilingual dictionary and calculates correlation between senses of a target polysemous word and its associated words, which can be regarded as clues for identifying the sense of the target word. To overcome the problem of disparity of topical coverage between corpora of the two languages as well as the problem of ambiguity in word-association alignment, an algorithm for iteratively calculating a sense-vs.-clue correlation matrix for each target word was devised. Word-sense disambiguation for each instance of the target word is done by selecting the sense that maximizes the score, i.e., a weighted sum of the correlations between each sense and clues appearing in the context of the instance. An experiment using Wall Street Journal and Nihon Keizai Shimbun corpora together with the EDR bilingual dictionary showed that the new method has promising performance; namely, the F-measure of its sense selection was 74.6% compared to a baseline of 62.8%. The developed method will possibly be extended into a fully unsupervised method that features automatic division and definition of word senses.
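
The sense-selection step in the abstract above (choose the sense that maximizes a weighted sum of sense-vs.-clue correlations over the clues in the context) can be sketched as below. The correlation values, the sense labels, and the default weight of 1.0 are hypothetical; the abstract does not say how the weights are defined, and the iterative computation of the correlation matrix is not shown.

```python
def select_sense(correlation, context_clues, weights=None):
    """Pick the sense with the largest weighted sum of clue correlations.

    correlation: dict mapping sense -> dict mapping clue -> correlation value.
    context_clues: clue words that appear in the context of the instance.
    weights: optional dict mapping clue -> weight (assumed 1.0 if missing).
    """
    weights = weights or {}

    def score(sense):
        clue_corr = correlation[sense]
        return sum(weights.get(c, 1.0) * clue_corr.get(c, 0.0) for c in context_clues)

    return max(correlation, key=score)


# Hypothetical sense-vs.-clue correlations for a polysemous word.
correlation = {
    "bank/finance": {"loan": 0.9, "river": 0.0},
    "bank/shore": {"river": 0.8, "loan": 0.1},
}
print(select_sense(correlation, ["river", "water"]))
```

Clues that are absent from the correlation table contribute nothing to the score, so an unfamiliar context word such as "water" in the example does not affect the choice.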

  • Density-Based Spam Detector

    Kenichi YOSHIDA  Fuminori ADACHI  Takashi WASHIO  Hiroshi MOTODA  Teruaki HOMMA  Akihiro NAKASHIMA  Hiromitsu FUJIKAWA  Katsuyuki YAMAZAKI  

     
    PAPER-Internet Systems

    Vol: E87-D No:12  Page(s): 2678-2688

    The volume of mass unsolicited electronic mail, often known as spam, has recently increased enormously and has become a serious threat not only to the Internet but also to society. This paper proposes a new spam detection method which uses document space density information. Although the proposed method requires extensive e-mail traffic to acquire the necessary information, it can achieve perfect detection (i.e., both recall and precision are 100%) under practical conditions. A direct-mapped cache method contributes to the handling of over 13,000 e-mail messages per second. Results of experiments conducted using over 50 million actual e-mail messages are also reported in this paper.
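
The abstract above mentions a direct-mapped cache but does not say what is stored in it. The sketch below shows the general technique only, assuming (hypothetically) that the cache counts how often a message fingerprint has been seen: each key maps to exactly one slot, and a colliding key simply overwrites the previous entry. The class name, the CRC32 hash, and the counting use case are illustrative assumptions, not the paper's design.

```python
import zlib


class DirectMappedCounter:
    """Fixed-size counter table in which each key maps to exactly one slot."""

    def __init__(self, num_slots):
        # Each slot holds a (key, count) pair, or None if unused.
        self.slots = [None] * num_slots

    def _index(self, key):
        # CRC32 is used only as a cheap, deterministic hash (an assumption).
        return zlib.crc32(key.encode("utf-8")) % len(self.slots)

    def observe(self, key):
        """Record one occurrence of key and return its current count."""
        i = self._index(key)
        entry = self.slots[i]
        if entry is not None and entry[0] == key:
            count = entry[1] + 1
        else:
            # Empty slot or collision: the old entry is evicted.
            count = 1
        self.slots[i] = (key, count)
        return count


cache = DirectMappedCounter(num_slots=1024)
counts = [cache.observe("fingerprint-a") for _ in range(3)]
print(counts)
```

Lookup and update are both constant time with no chaining or probing, which is the property that makes this structure attractive at high message rates. The cost is that a collision silently discards the evicted key's count.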

  • A Simple Learning Algorithm for Network Formation Based on Growing Self-Organizing Maps

    Hiroki SASAMURA  Toshimichi SAITO  Ryuji OHTA  

     
    LETTER-Nonlinear Problems

    Vol: E87-A No:10  Page(s): 2807-2810

    This paper presents a simple learning algorithm for network formation. The algorithm is based on self-organizing maps with growing cell structures and can adapt to input data that correspond to nodes of the network. In basic numerical experiments, when a parameter is selected suitably, our algorithm can generate a network having a small-world-like structure. Such a network structure appears in some natural networks and has advantages in practical systems.

  • Global and Local Feature Extraction by Natural Elastic Nets

    Jiann-Ming WU  Zheng-Han LIN  

     
    LETTER-Pattern Recognition

    Vol: E87-D No:9  Page(s): 2267-2271

    This work explores generative models of handwritten digit images using natural elastic nets. The analysis aims to extract global features as well as distributed local features of handwritten digits. These features are expected to form a basis that is significant for discriminant analysis of handwritten digits and related analysis of character images or natural images.

  • Multi-Stage Unsupervised Learning for Multi-Body Motion Segmentation

    Yasuyuki SUGAYA  Kenichi KANATANI  

     
    PAPER-Image Recognition, Computer Vision

    Vol: E87-D No:7  Page(s): 1935-1942

    Many techniques have been proposed for segmenting feature point trajectories tracked through a video sequence into independent motions, but objects in the scene are usually assumed to undergo general 3-D motions. As a result, the segmentation accuracy considerably deteriorates in realistic video sequences in which object motions are nearly degenerate. In this paper, we propose a multi-stage unsupervised learning scheme first assuming degenerate motions and then assuming general 3-D motions and show by simulated and real video experiments that the segmentation accuracy significantly improves without compromising the accuracy for general 3-D motions.

  • Active Learning with Model Selection -- Simultaneous Optimization of Sample Points and Models for Trigonometric Polynomial Models

    Masashi SUGIYAMA  Hidemitsu OGAWA  

     
    PAPER-Pattern Recognition

    Vol: E86-D No:12  Page(s): 2753-2763

    In supervised learning, the selection of sample points and models is crucial for acquiring a higher level of generalization capability. So far, the problems of active learning and model selection have been studied independently. If sample points and models are optimized simultaneously, a higher level of generalization capability is expected. We call this problem active learning with model selection. However, active learning with model selection cannot generally be solved by simply combining existing active learning and model selection techniques because of the active learning/model selection dilemma: the model should be fixed for selecting sample points, and conversely the sample points should be fixed for selecting models. In this paper, we show that the dilemma can be dissolved if there is a set of sample points that is optimal for all models under consideration. Based on this idea, we give a practical procedure for active learning with model selection in trigonometric polynomial models. The effectiveness of the proposed procedure is demonstrated through computer simulations.

  • A GA-Based Learning Algorithm for Binary Neural Networks

    Masanori SHIMADA  Toshimichi SAITO  

     
    LETTER-Nonlinear Problems

    Vol: E85-A No:11  Page(s): 2544-2546

    This paper presents a flexible learning algorithm for the binary neural network that can realize a desired Boolean function. The algorithm determines hidden layer parameters using a genetic algorithm. It can reduce the number of hidden neurons and can suppress parameter dispersion. These advantages are verified by basic numerical experiments.

  • Stability of Topographic Mappings between Generalized Cell Layers

    Shouji SAKAMOTO  Youichi KOBUCHI  

     
    PAPER-Biocybernetics, Neurocomputing

    Vol: E85-D No:7  Page(s): 1145-1152

    To elucidate the mechanism of topographic organization, we propose a simple model of topographic mapping formation from one generalized cell layer to another. Here, a generalized cell layer is one in which arbitrary cell neighborhood relations are allowed. In our previous work, we investigated a topographic mapping formation model between one-dimensional cell layers. In this paper, we extend the cell layer structure to any dimension. In our model, each cell takes a binary state value, and we consider a class of learning principles which are extensions of Hebb's rule and Anti-Hebb's rule. We pay special attention to correlation-type learning rules, where a synaptic weight value is increased if the pre- and postsynaptic cell states have the same value. We first show that a mapping is stable with respect to the correlational learning if and only if it is semi-embedding. Second, we introduce a special class of weight matrices called band type and show that the set of band-type weight matrices is strongly closed and that such a weight matrix cannot yield a topographic mapping. Third, we show by computer simulations that a mapping, if it is defined by a non-band-type weight matrix, converges to a topographic mapping under the correlational learning rules.
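
A correlation-type rule of the kind described in the abstract above can be sketched as follows: a synaptic weight is increased when the presynaptic and postsynaptic binary states agree. The abstract does not say what happens when the states differ, so this sketch leaves those weights unchanged; that choice, the learning rate, and the function name are assumptions.

```python
def correlational_update(weights, pre, post, rate=0.1):
    """Return new weights after one correlation-type learning step.

    weights[i][j] connects presynaptic cell j to postsynaptic cell i.
    pre and post are lists of binary cell states (0 or 1).
    A weight grows by `rate` when the two states are equal and is left
    unchanged otherwise (the latter is an assumption).
    """
    return [
        [w + rate if post[i] == pre[j] else w for j, w in enumerate(row)]
        for i, row in enumerate(weights)
    ]


weights = [[0.0, 0.0], [0.0, 0.0]]
updated = correlational_update(weights, pre=[1, 0], post=[1, 1])
print(updated)
```

Note that under this rule two inactive cells (both 0) also strengthen their connection, which distinguishes it from a plain Hebbian product of activities.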

  • Active Learning for Optimal Generalization in Trigonometric Polynomial Models

    Masashi SUGIYAMA  Hidemitsu OGAWA  

     
    PAPER-Algorithms and Data Structures

    Vol: E84-A No:9  Page(s): 2319-2329

    In this paper, we consider the problem of active learning and give a necessary and sufficient condition on sample points for the optimal generalization capability. By utilizing the properties of pseudo orthogonal bases, we clarify the mechanism of achieving the optimal generalization capability. We also show that the condition not only provides the optimal generalization capability but also reduces the computational complexity and memory required to calculate learning result functions. Based on the optimality condition, we give methods for designing optimal sample points for trigonometric polynomial models. Finally, the effectiveness of the proposed active learning method is demonstrated through computer simulations.

  • An Approach to Vehicle Recognition Using Supervised Learning

    Takeo KATO  Yoshiki NINOMIYA  

     
    PAPER

    Vol: E83-D No:7  Page(s): 1475-1479

    To enhance safety and traffic efficiency, driver assistance systems and autonomous vehicle systems are being developed. A method for recognizing the preceding vehicle is important for developing such systems. In this paper, a vision-based preceding vehicle recognition method based on supervised learning from sample images is proposed. An improvement to the Modified Quadratic Discriminant Function (MQDF) classifier used in the proposed method is also shown. Many studies on road environment recognition, including preceding vehicle recognition, have been reported. However, those studies have rarely included a quantitative evaluation with a large number of images. In this paper, over 1,000 sample images of passenger vehicles, recorded on a highway during the daytime, are used for evaluation. The evaluation result shows that performance in the low-order case is improved over the ordinary MQDF. Accordingly, the calculation time is reduced by more than 20% by using the proposed method. The feasibility of the proposed method is also demonstrated by a classification rate of over 98%.

  • Realization of Admissibility for Supervised Learning

    Akira HIRABAYASHI  Hidemitsu OGAWA  Akiko NAKASHIMA  

     
    PAPER-Biocybernetics, Neurocomputing

    Vol: E83-D No:5  Page(s): 1170-1176

    In supervised learning, one of the major learning methods is memorization learning (ML). Since it reduces only the training error, ML does not guarantee good generalization capability in general. When ML is used, however, good generalization capability is nevertheless expected. One of the present authors, H. Ogawa, interpreted this usage of ML as a means of realizing 'true objective learning', which directly takes generalization capability into account, and introduced the concept of admissibility. If a learning method can provide the same generalization capability as a true objective learning, it is said that the objective learning admits the learning method. Hence, if admissibility does not hold, making it hold becomes important. In this paper, we introduce the concept of realization of admissibility, and devise a method for realizing admissibility of ML with respect to projection learning, which directly takes generalization capability into account.

  • Divergence-Based Geometric Clustering and Its Underlying Discrete Proximity Structures

    Hiroshi IMAI  Mary INABA  

     
    INVITED PAPER

    Vol: E83-D No:1  Page(s): 27-35

    This paper surveys recent progress in the investigation of the underlying discrete proximity structures of geometric clustering with respect to the divergence in information geometry. Geometric clustering with respect to the divergence provides powerful unsupervised learning algorithms, and can be applied to classifying and obtaining generalizations of complex objects represented in the feature space. The proximity relation, defined by the Voronoi diagram by the divergence, plays an important role in the design and analysis of such algorithms.

  • Synthesis and Analysis of a Digital Chaos Circuit Generating Multiple-Scroll Strange Attractors

    Kei EGUCHI  Takahiro INOUE  Akio TSUNEDA  

     
    PAPER

    Vol: E82-A No:6  Page(s): 965-972

    In this paper, a new digital chaos circuit that can generate multiple-scroll strange attractors is proposed. Because it is based on a piecewise-linear function determined by on-chip supervised learning, the proposed digital chaos circuit can generate multiple-scroll strange attractors. Hence, the proposed circuit can exhibit various bifurcation phenomena. By numerical simulations, the learning dynamics and the quasi-chaos generation of the proposed digital chaos circuit are analyzed in detail. Furthermore, as a design example of the integrated digital chaos circuit, the proposed circuit realizing a nonlinear function with five breakpoints is implemented on an FPGA (Field Programmable Gate Array). The synthesized FPGA circuit, which can generate n-scroll strange attractors (n=1, 2, 4), showed that the proposed circuit can be implemented on a single FPGA except for the SRAM.

  • A Flexible Learning Algorithm for Binary Neural Networks

    Atsushi YAMAMOTO  Toshimichi SAITO  

     
    PAPER-Neural Networks

    Vol: E81-A No:9  Page(s): 1925-1930

    This paper proposes a simple learning algorithm that can realize any Boolean function using three-layer binary neural networks. The algorithm has two flexible learning functions: 1) a moving "core" for separating the inputs, and 2) "don't care" settings for the separated inputs. The "don't care" inputs do not affect the successive separations. By performing numerical simulations on some typical examples, we have verified that our algorithm can give fewer hidden-layer neurons than conventional algorithms.
